COMPUTATION AND ANALYSIS: A thesis presented in partial fulfilment of the requirements

Author

  • Atheer Matroud
Abstract

Biological sequences have long been known to contain many classes of repeats. The most studied repetitive structure is the tandem repeat, in which many approximate copies of a common segment (the motif) appear consecutively. In this thesis, a complex repetitive structure called a nested tandem repeat is investigated. It consists of many approximate copies of two motifs interspersed with one another. This thesis is a collection of published and in-progress papers, each addressing a computational problem related to the analysis of nested tandem repeats. Nested tandem repeats have been observed in the intergenic spacer of the ribosomal DNA gene in Colocasia esculenta. The question of whether such repeats can be found elsewhere in biological sequence databases is addressed, and NTRFinder, a software tool to detect nested tandem repeats, is described. Another problem, which arises once a nested tandem repeat has been detected, is the alignment of the nested tandem repeat region against its two motifs. An algorithm that guarantees an optimal solution to this problem is introduced. After nested tandem repeats have been detected and their structures identified, identifying the motif boundaries remains an unsolved problem, one that arises not only in nested tandem repeats but in tandem repeats as well. Heuristic solutions to this problem are implemented and tested. To compare two tandem repeat sequences, an algorithm is presented that aligns a hypothetical ancestral sequence of both sequences against each sequence. This algorithm considers substitutions, deletions, and unidirectional duplication, namely from ancestor to descendant.


Similar resources

On the Logical Foundations of Staged Computation (Invited talk)

Dividing a computation into stages and optimizing later phases using information from earlier phases is a familiar technique in algorithm design. In the realm of programming languages, staged computation has found two important realizations: partial evaluation and run-time code generation. A priori, these are fundamentally operational concepts, concerned with how a program executes, but...


Distributed Scientific Computation With JavaSpaces

Java-based tuplespaces provide a new avenue of exploration for distributed computing with commodity technology. This paper presents early results from our investigation of JavaSpaces for scientific computation. We discuss weaknesses as revealed by our attempts to map existing parallel algorithms to the JavaSpaces model, and use low-level metrics to argue that several classes of problems are not efficien...


Counting Complexity Classes

Shripad Thite, May 11, 1998. Abstract: The counting complexity classes are defined in terms of the number of accepting computation paths of nondeterministic polynomial-time Turing machines. They are, therefore, the counting versions of decision problems in NP. We review the properties of well-known counting classes like #P, P, GapP, SPP etc. We also give an overview of the proof of Toda's theorem that rela...


A Clustering Heuristic for Multiprocessor Environments Using Computation and Communication Loads of M

In this paper, we have developed a heuristic for the task allocation problem on a fully connected homogeneous multiprocessor environment. Our heuristic is based on a value associated with the modules called the Computation-Communication-Load (CCLoad). This value is dependent on the computation and the communication times associated with the module. Using the concept of CCLoad, we propose a clus...


Computation and Decision-Making in Large Extensive Form Games

In this thesis, we investigate the problem of decision-making in large two-player zero-sum games using Monte Carlo sampling and regret minimization methods. We demonstrate four major contributions. The first is Monte Carlo Counterfactual Regret Minimization (MC-CFR): a generic family of sample-based algorithms that compute near-optimal equilibrium strategies. Secondly, we develop a...




Publication date: 2013